Goto

Collaborating Authors

 order tensor product


Learning to Reason with Third Order Tensor Products

Neural Information Processing Systems

We combine Recurrent Neural Networks with Tensor Product Representations to learn combinatorial representations of sequential data. This improves symbolic interpretation and systematic generalisation. Our architecture is trained end-to-end through gradient descent on a variety of simple natural language reasoning tasks, significantly outperforming the latest state-of-the-art models in single-task and all-tasks settings. We also augment a subset of the data such that training and test data exhibit large systematic differences and show that our approach generalises better than the previous state-of-the-art.


Reviews: Learning to Reason with Third Order Tensor Products

Neural Information Processing Systems

Summary This paper presents a question-answering system based on tensor product representations. Given a latent sentence encoding, different MLPs extract entity and relation representations which are then used to update an tensor product representations of order-3 and trained end-to-end from the downstream success of correctly answering the question. Experiments are limited to bAbI question answering, which is disappointing as this is a synthetic corpus with a simple known underlying triples structure. While the proposed system outperforms baselines like recurrent entity networks (RENs) by a small difference in mean error, RENs have also been applied to more real-world tasks such as the Children's Book Test (CBT). Strengths - I like that the authors do not just report the best performance of their model, but also the mean and variance from five runs.


Learning to Reason with Third Order Tensor Products

Schlag, Imanol, Schmidhuber, Jürgen

Neural Information Processing Systems

We combine Recurrent Neural Networks with Tensor Product Representations to learn combinatorial representations of sequential data. This improves symbolic interpretation and systematic generalisation. Our architecture is trained end-to-end through gradient descent on a variety of simple natural language reasoning tasks, significantly outperforming the latest state-of-the-art models in single-task and all-tasks settings. We also augment a subset of the data such that training and test data exhibit large systematic differences and show that our approach generalises better than the previous state-of-the-art. Papers published at the Neural Information Processing Systems Conference.